Robust Estimation and Testing
A Generally Robust Approach To Hypothesis Testing in Independent and Correlated Groups Designs

Authors

  • H. J. Keselman
  • Lisa M. Lix
Abstract

Standard least squares analysis of variance methods suffer from poor power under arbitrarily small departures from normality and fail to control the probability of a Type I error when standard assumptions are violated. These problems are vastly reduced when using a robust measure of location; incorporating bootstrap methods can result in additional benefits. This paper illustrates the use of trimmed means with an approximate degrees of freedom heteroscedastic statistic for independent and correlated groups designs in order to achieve robustness to the biasing effects of nonnormality and variance heterogeneity. We also indicate when a bootstrap methodology can be effectively employed to provide improved Type I error control, and we illustrate, with examples from the psychophysiological literature, the use of a new computer program to obtain numerical results for these solutions.

Descriptors: Heteroscedastic variances, nonnormal distributions, robust estimators, bootstrapping, independent and correlated groups.

On a number of occasions, Psychophysiology has published articles that are intended to identify problems with traditional methods of analyzing psychophysiological data and indicate how valid and reliable results could generally be obtained by adopting newer methods (e.g., Keselman, 1998). Our intention in the present article is to extend that body of literature by offering a framework for statistical tests (omnibus and general focused hypothesis tests) that are robust to the biasing effects of variance heterogeneity and nonnormality in both independent and correlated groups designs.
Research has shown that the deleterious effects of (co)variance heterogeneity on the usual omnibus analysis of variance (ANOVA) F and linear contrast tests (Student's t) generally can be overcome by adopting Welch (1938, 1951)-type statistics (see Lix & Keselman, 1998; Keselman, Lix & Kowalchuk, 1998), that is, statistics that do not pool across heterogeneous sources of variability and where error degrees of freedom are estimated from the sample data. The biasing effects of nonnormality can also generally be overcome by adopting robust measures of central tendency and variability, that is, by using trimmed means and Winsorized (co)variances rather than the usual least squares estimators (see Lix & Keselman, 1998; Wilcox, 1997). A number of papers have demonstrated that one can indeed generally achieve robustness to nonnormality and (co)variance heterogeneity in unbalanced independent and correlated groups designs by using robust estimators with heteroscedastic test statistics (Keselman, Algina, Wilcox, & Kowalchuk, 2000; Keselman, Kowalchuk & Lix, 1998). Further improvement in Type I error control is often possible by obtaining critical values for test statistics through bootstrap methods. Such improvement has been demonstrated with statistics for independent groups designs (Wilcox, Keselman, & Kowalchuk, 1998). Wasserman and Bockenholt (1989) introduced the technique of bootstrapping to psychophysiologists, defining the methodology and indicating how various inferential problems [e.g., correlational and general linear model (GLM) analyses] could be addressed via bootstrapping techniques.

Our paper is a follow-up to Wasserman and Bockenholt (1989) in three important ways: (a) we demonstrate how bootstrapping can be applied to tests of significance, rather than just to interval estimates around population parameters; (b) we discuss the use of the bootstrapping methodology with robust estimators (viz.,
trimmed means) rather than the usual least squares estimates; and (c) we illustrate the use of a new computer program to produce Welch (1938, 1951)-type approximate degrees of freedom (ADF) test statistics in combination with robust estimators and/or bootstrapping. These topics are discussed both for independent and correlated groups designs. [A nontechnical exposition of robust estimation and testing can be found in Wilcox (2001).]

A General ADF Test Statistic

Methods that give improved power and better control over the probability of a Type I error can be formulated using a GLM ADF perspective. Lix and Keselman (1995) showed how the various Welch (1938, 1951) statistics that appear in the literature for testing omnibus main and interaction effects as well as focused hypotheses using contrasts in univariate and multivariate independent and correlated groups designs can be formulated from a GLM ADF perspective, thus allowing researchers to apply one statistical procedure to any testable model effect. We adopt their approach in this paper and begin by presenting, in abbreviated form, its mathematical underpinnings.¹

A general approach for testing hypotheses of mean equality using an ADF solution is developed using matrix notation. The multivariate perspective is considered first; the univariate model is a special case of the multivariate. Consider the general linear model:

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\xi}, \tag{1}$$

where $\mathbf{Y}$ is an $N \times p$ matrix of scores on p dependent variables or p repeated measurements, N is the total sample size, $\mathbf{X}$ is an $N \times r$ design matrix consisting entirely of zeros and ones with rank($\mathbf{X}$) $= r$, $\boldsymbol{\beta}$ is an $r \times p$ matrix of nonrandom parameters (i.e., population means), and $\boldsymbol{\xi}$ is an $N \times p$ matrix of random error components. Let $\mathbf{Y}_j$ ($j = 1, \ldots, r$) denote the submatrix of $\mathbf{Y}$ containing the scores associated with the $n_j$ subjects in the jth group (cell).
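As a small illustration of the model in Equation 1, $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\xi}$, the sketch below builds a zero/one design matrix for a one-way layout and shows that $\mathbf{X}\boldsymbol{\beta}$ assigns each subject the mean vector of his or her group. The function names (`design_matrix`, `matmul`), the group sizes, and the parameter values are hypothetical and chosen only for illustration; they do not come from Lix and Keselman (1995).

```python
def design_matrix(group_sizes):
    """Return the N x r zero/one design matrix X for a one-way layout.

    Row i has a single 1 in the column of the group that subject i belongs to.
    """
    r = len(group_sizes)
    rows = []
    for j, n_j in enumerate(group_sizes):
        for _ in range(n_j):
            rows.append([1 if k == j else 0 for k in range(r)])
    return rows


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][m] for k in range(len(b))) for m in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical example: r = 2 groups with n_1 = 2 and n_2 = 3, and p = 2 measures.
group_sizes = [2, 3]
beta = [
    [10.0, 12.0],  # population means of group 1 on the p = 2 measures
    [20.0, 25.0],  # population means of group 2
]

X = design_matrix(group_sizes)
expected_scores = matmul(X, beta)  # E(Y) = X * beta, an N x p matrix
```

Under these assumptions the first two rows of `expected_scores` repeat the first row of `beta` and the last three rows repeat the second, which is what the block structure of $\mathbf{Y}_j$ implies.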
It is typically assumed that the rows of $\mathbf{Y}$ are independently and normally distributed, with mean vector $\boldsymbol{\beta}_j$ and variance-covariance matrix $\boldsymbol{\Sigma}_j$ [i.e., $N(\boldsymbol{\beta}_j, \boldsymbol{\Sigma}_j)$], where the jth row of $\boldsymbol{\beta}$, $\boldsymbol{\beta}_j = [\mu_{j1} \ldots \mu_{jp}]$, and $\boldsymbol{\Sigma}_j \ne \boldsymbol{\Sigma}_{j'}$ ($j \ne j'$). Specific formulas for estimating $\boldsymbol{\beta}$ and $\boldsymbol{\Sigma}_j$, as well as an elaboration of $\mathbf{Y}$, are given in Lix and Keselman (1995, see their Appendix A).

The general linear hypothesis is

$$H_0: \mathbf{R}\boldsymbol{\mu} = \mathbf{0}, \tag{2}$$

where $\mathbf{R} = \mathbf{C} \otimes \mathbf{U}^T$, $\mathbf{C}$ is a $df_C \times r$ matrix which controls contrasts on the independent groups effect(s), with rank($\mathbf{C}$) $= df_C \le r$, and $\mathbf{U}$ is a $p \times df_U$ matrix which controls contrasts on the within-subjects effect(s), with rank($\mathbf{U}$) $= df_U \le p$, '$\otimes$' is the Kronecker or direct product function, and 'T' is the transpose operator. For multivariate independent groups designs, $\mathbf{U}$ is an identity matrix of dimension p (i.e., $\mathbf{I}_p$). The $\mathbf{R}$ contrast matrix has $df_C \times df_U$ rows and $r \times p$ columns. In Equation 2, $\boldsymbol{\mu} = \mathrm{vec}(\boldsymbol{\beta}^T) = [\boldsymbol{\beta}_1 \ldots \boldsymbol{\beta}_r]^T$. In other words, $\boldsymbol{\mu}$ is the column vector with $r \times p$ elements obtained by stacking the columns of $\boldsymbol{\beta}^T$. The column vector $\mathbf{0}$ is of order $df_C \times df_U$ [see Lix & Keselman (1995) for illustrative examples].

The generalized test statistic given by Johansen (1980) is

$$T_{WJ} = (\mathbf{R}\hat{\boldsymbol{\mu}})^T (\mathbf{R}\hat{\boldsymbol{\Sigma}}\mathbf{R}^T)^{-1} (\mathbf{R}\hat{\boldsymbol{\mu}}), \tag{3}$$

where $\hat{\boldsymbol{\mu}}$ estimates $\boldsymbol{\mu}$, and $\hat{\boldsymbol{\Sigma}} = \mathrm{diag}[\hat{\boldsymbol{\Sigma}}_1/n_1 \ldots \hat{\boldsymbol{\Sigma}}_r/n_r]$, a block matrix with diagonal elements $\hat{\boldsymbol{\Sigma}}_j/n_j$. This statistic, divided by a constant, c (i.e., $T_{WJ}/c$), approximately follows an F distribution with degrees of freedom $\nu_1 = df_C \times df_U$, and $\nu_2 = \nu_1(\nu_1 + 2)/(3A)$, where $c = \nu_1 + 2A - (6A)/(\nu_1 + 2)$. The formula for the statistic, A, is provided in Lix and Keselman (1995).

When $p = 1$, that is, for a univariate model, the elements of $\mathbf{Y}$ are assumed to be independently and normally distributed with mean $\mu_j$ and variance $\sigma_j^2$ [i.e., $N(\mu_j, \sigma_j^2)$]. To
test the general linear hypothesis, $\mathbf{C}$ has the same form and function as for the multivariate case, but now $\mathbf{U} = 1$, $\hat{\boldsymbol{\mu}} = [\hat{\mu}_1 \ldots \hat{\mu}_r]^T$ and $\hat{\boldsymbol{\Sigma}} = \mathrm{diag}[\hat{\sigma}_1^2/n_1 \ldots \hat{\sigma}_r^2/n_r]$ (see Lix & Keselman's 1995 Appendix A for further details of the univariate model).

Obtaining Numerical Results Using an ADF Solution with Robust Estimators and/or Bootstrapping

Keselman, Wilcox and Lix (2001) present a SAS/IML (SAS Institute Inc, 1999) program which can be used to obtain numerical results for the general ADF solution. The program can also be obtained from the first author's website at http://www.umanitoba.ca/faculties/arts/psychology/. This program is an extension of the program found in Lix and Keselman (1995). The general ADF solution contained in the current program can be applied with robust estimators, that is, trimmed means and Winsorized variances (covariances), and can also be used in conjunction with a bootstrapping methodology. Tests of omnibus main effects or interaction effects may be performed, in addition to tests of individual contrasts or families of contrasts. The program can be applied in a variety of research designs; several applications of the program will be explored in the following sections of this paper.

The main module, which is called WJGLM, requires as input Y, C, NX, OPT1, and OPT2. By default, $\mathbf{U} = \mathbf{I}_p$, but for correlated groups designs, the program user must specify the elements of U. The vector NX is a $1 \times r$ vector containing the number of observations in each group or cell (i.e., the $n_j$s). It is assumed that the order of entry for Y and NX correspond, so that the first $n_1$ rows of Y correspond to the first element of NX, the next $n_2$ rows of Y correspond to the second element of NX, and so on.
The scalar OPT1 can assume values of 0 or 1 only; a 0 is specified when the program user does not want to apply robust estimation in conjunction with the ADF solution, while a 1 is specified for robust estimation. The scalar OPT2 also assumes values of 0 or 1; a 0 is specified when the program user does not want to apply the bootstrapping methodology, while a 1 is used to indicate that bootstrapping should be used. When OPT1 = 1, the program user must also specify a value for PER, which represents the proportion of trimming (discussed later in the paper). PER can range in value from 0 (no trimming) to a value less than or equal to .5; a common choice might be PER = .20, which represents a 20% symmetric trim rule. When OPT2 = 1, the program user must also specify an integer value for the scalar NUMSIM, which represents the number of simulations for the bootstrapping methodology, and for SEED, which defines the initial argument for the first call to the bootstrap simulation. SEED can be any integer up to $2^{31} - 1$. If SEED = 0 is specified, the computer's internal clock is used as the argument. The main module is invoked with a RUN WJGLM statement. The output of the program is determined by the user's choices for C, U, OPT1, and OPT2; further details are provided in later sections of the paper.

A second module, called BOOTCOM, is also included in the program. It computes the ADF solution for a family of contrasts when the program user wishes to use bootstrapping and control the familywise Type I error rate (FWR). This module, which is invoked with a RUN BOOTCOM statement, requires as input Y, C, NX, OPT1, NUMSIM, and ALPHA. ALPHA sets the FWR. For this module, C is used to specify a set of contrasts on the independent groups effect(s). Specific examples will help to illustrate the options that are available when using this program.
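To make the univariate form of Equation 3 concrete, the following Python sketch computes the omnibus statistic for a one-way independent groups design with least squares estimators (no trimming, no bootstrapping). It is not the WJGLM module and does not reproduce the SAS/IML program; the function names `split_by_group` and `welch_adf_oneway` are hypothetical. The sketch rests on two assumptions that readers should check against Lix and Keselman (1995): that for the omnibus test $T_{WJ}$ equals the weighted sum of squares $\sum_j w_j(\hat{\mu}_j - \tilde{\mu})^2$ with $w_j = n_j/\hat{\sigma}_j^2$ and $\tilde{\mu} = \sum_j w_j\hat{\mu}_j / \sum_j w_j$, and that A reduces to $\sum_j (1 - w_j/W)^2/(n_j - 1)$ with $W = \sum_j w_j$, which is the form used in Welch's (1951) test. The stacked scores and the vector of group sizes follow the ordering convention described above for Y and NX.

```python
from statistics import mean, variance


def split_by_group(y, nx):
    """Partition stacked scores y into groups: the first nx[0] scores form
    group 1, the next nx[1] scores form group 2, and so on."""
    if sum(nx) != len(y):
        raise ValueError("group sizes in nx must sum to len(y)")
    groups, start = [], 0
    for n_j in nx:
        groups.append(y[start:start + n_j])
        start += n_j
    return groups


def welch_adf_oneway(y, nx):
    """Return (T_WJ / c, nu1, nu2) for the omnibus one-way test.

    Assumption (not taken from the paper's program): the univariate
    simplification of A is sum((1 - w_j / W) ** 2 / (n_j - 1)).
    """
    groups = split_by_group(y, nx)
    if len(groups) < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least two groups with two scores each")
    means = [mean(g) for g in groups]
    # Diagonal of Sigma-hat: squared standard errors sigma_j^2 / n_j.
    se2 = [variance(g) / len(g) for g in groups]
    if any(s == 0 for s in se2):
        raise ValueError("every group needs a nonzero variance")
    w = [1.0 / s for s in se2]
    big_w = sum(w)
    weighted_grand_mean = sum(wj * m for wj, m in zip(w, means)) / big_w
    t_wj = sum(wj * (m - weighted_grand_mean) ** 2 for wj, m in zip(w, means))
    a = sum((1 - wj / big_w) ** 2 / (len(g) - 1) for wj, g in zip(w, groups))
    nu1 = len(groups) - 1
    c = nu1 + 2 * a - 6 * a / (nu1 + 2)
    nu2 = nu1 * (nu1 + 2) / (3 * a)
    return t_wj / c, nu1, nu2


# Hypothetical scores: group 1 = (1, 2, 3), group 2 = (2, 4, 6, 8).
f_stat, nu1, nu2 = welch_adf_oneway([1, 2, 3, 2, 4, 6, 8], [3, 4])
```

With two groups the constant c equals 1, so the result coincides with the square of Welch's two-sample t statistic and its Satterthwaite degrees of freedom, which offers a convenient check on the sketch. A p-value would require the F distribution function, which is left out here to keep the example within the standard library.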
Applications of the ADF Solution

One-Way Independent Groups Design

A great deal of evidence indicates that the traditional tests for mean equality are adversely affected by nonnormality, particularly when variances are heterogeneous and group sizes are unequal (see Lix & Keselman, 1998; Wilcox, 1995). That is, Type I error and power rates are substantially affected when these assumptions are jointly violated. In particular, depending on whether there is a positive or negative correlation between group sizes and (within-) group variances, the risk of a Type I error can be inflated or deflated relative to the nominal alpha (e.g., $\alpha = .05$) level and, correspondingly, the power to detect a treatment effect may be depressed or enhanced.

Reductions in power occur because the usual population standard deviation ($\sigma$) is greatly influenced by the presence of extreme observations (outliers) in a distribution of scores. Consequently, the standard error (SE) of the mean, $\sqrt{\sigma^2/n}$, can become seriously inflated when the underlying distribution has heavy tails (Wilcox, 1995). Thus, the standard errors used by t and F are relatively large and power accordingly will be depressed. One can substitute a robust measure of location and a corresponding robust measure of scale. Trimmed means and variances based on Winsorized sums of squares enable one to obtain test statistics which achieve minimal losses in power due to nonnormality. Indeed, a considerable amount of evidence has accumulated to date supporting this position (see Wilcox, 1995, 1997, 2001).

With regard to spurious rejections, many investigators have shown that better Type I error results can be obtained by using test statistics designed for heterogeneity combined with robust estimators of central tendency and variability (see Lix & Keselman, 1998; Keselman et al., 1998; Wilcox, 1995; Yuen, 1974).
Though rates of Type I error improved when adopting robust estimators with heteroscedastic statistics, these improved methods were nonetheless still occasionally affected when distributions were nonnormal, variances were heterogeneous and group sizes were unequal. That is, Type I error rates did occasionally exceed .075 for $\alpha = .05$, attaining values close to .10. Westfall and Young's (1993) results suggest that Type I error control could be improved further by combining a bootstrap method with one based on trimmed means. Wilcox et al. (1998) provide empirical support for the use of robust estimators and test statistics with bootstrap-determined critical values in one-way independent groups designs. This benefit has also been demonstrated in correlated groups designs [see Keselman, Algina, Wilcox & Kowalchuk (2000); Keselman, Kowalchuk, Algina, Lix & Wilcox (2000)].

For an independent groups experiment with $n_j$ subjects ($\sum_j n_j = N$) in each of J groups, and using the notation of Equation 1, $\mathbf{Y} = (Y_{ij})$, where $Y_{ij}$ is the score associated with the ith subject in the jth group ($j = 1, \ldots, J$; $i = 1, \ldots, n_j$), $E(Y_{ij}) = \mu_j$, the jth population mean, $\boldsymbol{\beta} = [\mu_1 \ldots \mu_J]^T$ and $\boldsymbol{\xi} = (\varepsilon_{ij})$ defines the random error term. The $Y_{ij}$s are assumed to be $N(\mu_j, \sigma_j^2)$ variates, with $\hat{\mu}_j$ and $\hat{\sigma}_j^2$ respectively representing the jth sample mean and unbiased variance.

To test the general linear hypothesis of Equation 2, $\mathbf{R} = \mathbf{C} = \mathbf{C}_j$, because $\mathbf{U} = 1$. That is, $\mathbf{C}_j$ is a $(J - 1) \times J$ matrix for which the rows represent a set of linearly independent contrasts among the levels of the independent groups factor. With respect to Equation 3, $\hat{\boldsymbol{\mu}} = [\hat{\mu}_1 \ldots \hat{\mu}_J]^T$ and $\hat{\boldsymbol{\Sigma}} = \mathrm{diag}[\hat{\sigma}_1^2/n_1 \ldots \hat{\sigma}_J^2/n_J]$.

Pairwise contrasts on the group means are frequently of great interest. Using Equation 2, $\mathbf{R} = \mathbf{C} = \mathbf{c}_{jj'} = (c_1 \ldots c_J)$, the $1 \times J$ vector of coefficients which contrasts the jth and j'th means ($\sum_j c_j = 0$).
In other words, we test the null hypothesis $H_{jj'}: \mu_j = \mu_{j'}$ ($j \ne j'$).

Robust Estimation. In this paper we apply robust estimates of central tendency and variability to the ADF statistic. When researchers feel that they are dealing with populations that are nonnormal in form [Tukey (1960) suggested that outliers are a common occurrence in distributions and others have indicated that skewed distributions frequently depict psychological (reaction time) data] and thus subscribe to the position that inferences pertaining to robust parameters are more valid than inferences pertaining to the usual least squares parameters, then procedures based on robust estimators should be adopted. When trimmed means are being compared, the null hypothesis pertains to the equality of population trimmed means, i.e., the $\mu_t$s. That is, to test the general linear hypothesis in a one-way independent groups design we specify $H_0: \mathbf{R}\boldsymbol{\mu}_t = \mathbf{0}$.

Let $Y_{(1)j} \le Y_{(2)j} \le \cdots \le Y_{(n_j)j}$ represent the ordered observations associated with the jth group. Let $g_j = [\gamma n_j]$, where $\gamma$ represents the proportion of observations that are to be trimmed in each tail of the distribution and $[x]$ is the greatest integer $\le x$. The effective sample size for the jth group becomes $h_j = n_j - 2g_j$. The jth sample trimmed mean is

$$\hat{\mu}_{tj} = \frac{1}{h_j} \sum_{i = g_j + 1}^{n_j - g_j} Y_{(i)j}. \tag{4}$$
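The trimming rule of Equation 4 can be sketched in a few lines of Python. This is only an illustration: the function name `trimmed_mean` and the scores are hypothetical, and `gamma` plays the role of the per-tail trimming proportion $\gamma$. Values of `gamma` that leave no observations ($h_j = 0$) are rejected.

```python
import math


def trimmed_mean(scores, gamma=0.20):
    """Return (trimmed mean, g, h) for one group.

    g = floor(gamma * n) scores are dropped from each tail of the ordered
    data, leaving h = n - 2g scores that are averaged.
    """
    if not 0 <= gamma <= 0.5:
        raise ValueError("gamma must lie between 0 and .5")
    ordered = sorted(scores)
    n = len(ordered)
    # Note: gamma * n is a floating-point product, so a value that should be
    # an exact integer can occasionally fall just below it before flooring.
    g = math.floor(gamma * n)
    h = n - 2 * g
    if h <= 0:
        raise ValueError("trimming removes every observation")
    kept = ordered[g:n - g]  # Y_(g+1), ..., Y_(n-g)
    return sum(kept) / h, g, h


# Hypothetical group of n = 10 scores with one extreme observation.
scores = [2, 4, 6, 8, 10, 12, 14, 16, 18, 100]
mu_t, g, h = trimmed_mean(scores, gamma=0.20)
```

For these scores $g_j = 2$ and $h_j = 6$, so the two smallest and two largest scores (including the outlying 100) are removed and the trimmed mean is 11.0, compared with an ordinary mean of 19.0.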


Publication date: 2002